Assessing the invasive potential of different source populations of ragweed (Ambrosia artemisiifolia L.) through genomically informed species distribution modelling

Abstract The genetic composition of founding populations is likely to play a key role in determining invasion success. Individual genotypes may differ in habitat preference and environmental tolerance, so their ability to colonize novel environments can be highly variable. Despite the importance of genetic variation on invasion success, its influence on the potential distribution of invaders is rarely investigated. Here, we integrate population genomics and ecological niche models (ENMs) into a single framework to predict the distribution of globally invasive common ragweed (Ambrosia artemisiifolia) in Australia. We identified three genetic clusters for ragweed and used these to construct cluster‐specific ENMs and characterize within‐species niche differentiation. The potential range of ragweed in Australia depended on the genetic composition and continent of origin of the introduced population. Invaders originating from warmer, wetter climates had a broader potential distribution than those from cooler, drier ones. By quantifying this change, we identified source populations most likely to expand the ragweed distribution. As prevention remains the most effective method of invasive species management, our work provides a valuable way of ranking the threat posed by different populations to better inform management decisions.

We initially compared different ensemble niche models using the biomod2 package (Thuiller et al., 2021) in R 4.3.1. For each method, we fitted models to GBIF records from the given range and 10,000 randomly sampled pseudoabsences. We evaluated model performance by projecting models to Australia and estimating ROC values using Australian occurrence records and 10,000 randomly sampled Australian pseudoabsences (Table S1). Most methods performed similarly, with random forests performing best. However, we opted to use maximum entropy models because they were also among the best-performing methods and are more straightforward to implement.
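As an illustration of the evaluation step, the area under the ROC curve can be computed directly from predicted suitability at presence and pseudoabsence points. The sketch below is a minimal, rank-based version and not the biomod2 implementation; the function name `auc_from_scores` and the example scores are hypothetical.

```python
from bisect import bisect_left, bisect_right


def auc_from_scores(presence_scores, pseudoabsence_scores):
    """Rank-based AUC (hypothetical helper, not part of biomod2).

    Returns the probability that a randomly chosen presence point has a
    higher predicted suitability than a randomly chosen pseudoabsence,
    counting ties as half.
    """
    background = sorted(pseudoabsence_scores)
    total = 0.0
    for score in presence_scores:
        below = bisect_left(background, score)          # pseudoabsences scored lower
        ties = bisect_right(background, score) - below  # pseudoabsences scored equal
        total += below + 0.5 * ties
    return total / (len(presence_scores) * len(background))


# Made-up suitability values for two presences and two pseudoabsences.
example_auc = auc_from_scores([0.8, 0.3], [0.5, 0.2])
```

A value near 0.5 indicates predictions no better than random, and a value near 1 indicates that presences are consistently ranked above pseudoabsences.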

Figure S2. Effect of varying the number of ancestral populations L on model log likelihood. A) Model log likelihood L(L). B) The first-order rate of change in log likelihood L'(L). C) The absolute value of the second-order rate of change |L''(L)|. D) The mean of the second-order rate of change in log likelihood standardised by the standard deviation of the model log likelihoods. Red lines indicate the standard deviation of model log likelihoods based on 10 runs.
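The four quantities in panels A to D could be derived from replicate runs roughly as follows. This is a sketch under assumptions: the input layout (a dictionary from the number of ancestral populations to a list of replicate log likelihoods), the function name `likelihood_diagnostics`, and the choice to take second differences of the mean log likelihoods are all illustrative. The code uses `k` for the number of ancestral populations to avoid clashing with the likelihood symbol.

```python
from statistics import mean, stdev


def likelihood_diagnostics(loglik_runs):
    """Summarise replicate log likelihoods (hypothetical helper).

    loglik_runs maps each number of ancestral populations k to the log
    likelihoods of its replicate runs (at least two runs per k, with
    non-zero spread, so that the standard deviation is usable).
    """
    ks = sorted(loglik_runs)
    # Panel A: mean log likelihood for each k.
    mean_ll = {k: mean(loglik_runs[k]) for k in ks}
    # Panel B: first-order rate of change between consecutive k.
    first = {k: mean_ll[k] - mean_ll[k - 1] for k in ks if k - 1 in mean_ll}
    # Panel C: absolute second-order rate of change.
    second = {k: abs(first[k + 1] - first[k]) for k in first if k + 1 in first}
    # Panel D: second-order change standardised by the run-to-run spread.
    delta = {k: second[k] / stdev(loglik_runs[k]) for k in second}
    return mean_ll, first, second, delta


# Made-up log likelihoods for k = 1..4 with two runs each.
runs = {1: [-100, -102], 2: [-80, -82], 3: [-78, -80], 4: [-77, -79]}
mean_ll, first, second, delta = likelihood_diagnostics(runs)
best_k = max(delta, key=delta.get)
```

In this toy input the standardised statistic peaks at the value of `k` after which the gain in log likelihood levels off.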

Figure S3. Distribution of Schoener's D for between-range comparisons based on 200 random permutations. The observed D value is reported on the top right of each panel. The shaded region is the middle 95% of the null distribution. The red line indicates the observed D value and is absent if D fell outside the null distribution.

Figure S4. Distribution of Schoener's D for various cluster-range comparisons based on 200 random permutations. The observed D value is reported on the top right of each panel. The shaded region is the middle 95% of the null distribution. The red line indicates the observed D value and is absent if D fell outside the null distribution.
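For readers unfamiliar with the statistic, Schoener's D between two suitability surfaces p1 and p2, each normalised to sum to one over grid cells, is D = 1 - 0.5 * sum(|p1 - p2|). The sketch below computes D and a permutation null with its middle 95% interval. It is a simplified stand-in: it compares binned occurrence counts after shuffling group labels, and does not refit niche models for each permutation. All function names and the toy cell indices are hypothetical.

```python
import random


def schoeners_d(p1, p2):
    """Schoener's D for two non-negative surfaces over the same grid cells."""
    s1, s2 = sum(p1), sum(p2)
    return 1.0 - 0.5 * sum(abs(a / s1 - b / s2) for a, b in zip(p1, p2))


def density(cells, n_cells):
    """Count occurrences per grid cell (cells are integer cell indices)."""
    counts = [0] * n_cells
    for c in cells:
        counts[c] += 1
    return counts


def permutation_null(cells_a, cells_b, n_cells, n_perm=200, seed=0):
    """Null distribution of D from shuffling group labels (assumed design)."""
    rng = random.Random(seed)
    pooled = list(cells_a) + list(cells_b)
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(cells_a)], pooled[len(cells_a):]
        null.append(schoeners_d(density(perm_a, n_cells), density(perm_b, n_cells)))
    null.sort()
    # Middle 95%: drop the lowest and highest 2.5% of permuted values.
    lower = null[int(0.025 * n_perm)]
    upper = null[int(0.975 * n_perm) - 1]
    return null, (lower, upper)


# Toy occurrence cell indices for two groups on a four-cell grid.
group_a = [0, 0, 1, 1, 2, 0, 1, 2]
group_b = [2, 3, 3, 2, 3, 3, 1, 2]
observed = schoeners_d(density(group_a, 4), density(group_b, 4))
null, (lower, upper) = permutation_null(group_a, group_b, 4)
```

An observed D below `lower` would suggest less overlap than expected if the two groups shared a single distribution.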

Figure S5. Mobility-oriented parity (MOP; blue) maps overlaid on the predicted distribution of ragweed under different invasion scenarios involving the introduction of a single cluster from different ranges to Australia. For each subplot, the opacity of the MOP map depends on the number of bioclimatic variables for which values in Australia fell outside the observed range. Higher opacity indicates a greater degree of model extrapolation.

Figure S6. Mobility-oriented parity (MOP) maps overlaid on the predicted distribution of ragweed under different invasion scenarios involving the introduction of a single cluster from different ranges to Australia. For each subplot, the opacity of the MOP map depends on the number of land use/cover variables for which values in Australia fell outside the observed range. Higher opacity indicates a greater degree of model extrapolation.
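The opacity rule described for Figures S5 and S6 amounts to counting, for each grid cell in the projection region, how many predictors fall outside the range observed in the calibration data. A minimal sketch of that count is given below. It covers only this out-of-range tally, not the distance-based component of a full MOP analysis, and the function name and example values are hypothetical.

```python
def count_out_of_range(calibration, projection):
    """Count out-of-range predictors per projected cell (hypothetical helper).

    calibration and projection are lists of rows; each row holds one grid
    cell's values for the same predictors in the same order.
    """
    n_vars = len(calibration[0])
    lows = [min(row[j] for row in calibration) for j in range(n_vars)]
    highs = [max(row[j] for row in calibration) for j in range(n_vars)]
    return [
        sum(1 for j in range(n_vars) if not lows[j] <= row[j] <= highs[j])
        for row in projection
    ]


# Made-up values for two predictors (e.g. a temperature and a rainfall variable).
calibration_cells = [[10, 500], [20, 900]]
projection_cells = [[15, 700], [25, 700], [5, 1000]]
extrapolation_counts = count_out_of_range(calibration_cells, projection_cells)
```

Cells with higher counts would be drawn with higher opacity, flagging predictions that rest on more extrapolation.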

Table S1. Mean ROC for models trained on different ranges and validated in Australia.

Table S2. Description of environmental predictors used in modelling and mobility-oriented parity (MOP) analyses.

Table S3. Effect of increasing the number of ancestral populations L on five-fold cross-validation error of ADMIXTURE performed on North American samples.

Table S4. Number of observations for each cluster-range combination.

Table S5. Model AUC for models created using GBIF records with different training datasets and projected regions.

Table S6. Mean values of bioclimatic and land use/cover variables for different clusters. Different letters indicate statistically significant differences in mean values based on a two-sample Welch's t-test at the 95% confidence level.
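For reference, Welch's two-sample test compares means without assuming equal variances. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom for two samples; converting these to a p-value requires a t-distribution and is omitted. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Made-up values of one variable for two clusters.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```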

Table S7. Schoener's D estimate of niche overlap between cluster-ranges based on models created using 19 bioclimatic and 8 land use/cover variables (NA = North America, EU = Europe, AU = Australia).